Generative Painting Style Using Generative Adversarial Networks (GANs)¶

This project aims to build a Deep Convolutional Generative Adversarial Network (DCGAN) that trains on images of Monet paintings to capture their qualities and then generate Monet-esque images. Several techniques will be showcased such as image augmentation for training diversification, an iterative approach to DCGAN architecture resulting in multiple model runs and submissions, and a discussion on common issues that can arise during GAN training.


This project derives from the Kaggle competition GAN Getting Started (I'm Something of a Painter Myself), found here: https://www.kaggle.com/competitions/gan-getting-started/overview

You can find this project at the github repo: https://github.com/chill0121/Kaggle_Projects/tree/main/Adversarial_Painting

Table of Contents ¶


  • 1. Data Source Information
    • 1.1. Dataset Information
    • 1.2. Kaggle Information
  • 2. Setup
    • 2.1. Environment Details for Reproducibility
    • 2.2. Importing the Data
  • 3. Data Preprocessing
    • 3.1. First Looks
    • 3.2. Image Augmentation
  • 4. Exploratory Data Analysis (EDA)
  • 5. Models
    • 5.1. Deep Convolutional Generative Adversarial Network (DCGAN)
  • 6. Results and Discussion
  • 7. Conclusion
    • 7.1. Possible Areas for Improvement
  • Appendix A - Online References

1. Data Source Information ¶


1.1. Dataset Information: ¶

Color images (256 x 256 pixels) of Monet paintings and real-world photographs, provided by the competition as JPEG files.

  • The monet directories contain Monet paintings used to train the model.
  • The photo directories contain photos to which the Monet style can be applied for submission.

Data Info:

  • 300 Monet Painting Images
    • 256 x 256 x 3
  • 7028 Photos
    • 256 x 256 x 3

1.2. Kaggle Information: ¶

Description:¶

We recognize the works of artists through their unique style, such as color choices or brush strokes. The “je ne sais quoi” of artists like Claude Monet can now be imitated with algorithms thanks to generative adversarial networks (GANs). In this getting started competition, you will bring that style to your photos or recreate the style from scratch!

Computer vision has advanced tremendously in recent years and GANs are now capable of mimicking objects in a very convincing way. But creating museum-worthy masterpieces is thought of to be, well, more art than science. So can (data) science, in the form of GANs, trick classifiers into believing you’ve created a true Monet? That’s the challenge you’ll take on!

The Challenge: A GAN consists of at least two neural networks: a generator model and a discriminator model. The generator is a neural network that creates the images. For our competition, you should generate images in the style of Monet. This generator is trained using a discriminator.

The two models will work against each other, with the generator trying to trick the discriminator, and the discriminator trying to accurately classify the real vs. generated images.

Your task is to build a GAN that generates 7,000 to 10,000 Monet-style images.

Note: Monet-style art can be created from scratch using other GAN architectures like DCGAN. The submitted image files do not necessarily have to be transformed photos.

Evaluation:¶

Submissions are evaluated on MiFID (Memorization-informed Fréchet Inception Distance), which is a modification of Fréchet Inception Distance (FID).

The smaller MiFID is, the better your generated images are.

What is FID? Originally published here (github), FID and Inception Score (IS) are both commonly used in recent publications as standard evaluation methods for GANs.
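Kaggle's exact MiFID implementation isn't reproduced here, but the FID core is the Fréchet distance between two Gaussians fit to Inception feature activations. A rough sketch (the `frechet_distance` helper is hypothetical, and assumes numpy and scipy are available; random vectors stand in for real Inception features):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    # Squared Frechet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2):
    # ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrt(sigma1 @ sigma2))
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):  # sqrtm can return tiny imaginary parts
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Toy check with random "feature" vectors standing in for Inception activations.
rng = np.random.default_rng(11)
real_feats = rng.normal(size=(500, 8))
fake_feats = rng.normal(loc=0.5, size=(500, 8))

mu_r, sig_r = real_feats.mean(axis=0), np.cov(real_feats, rowvar=False)
mu_f, sig_f = fake_feats.mean(axis=0), np.cov(fake_feats, rowvar=False)

print(frechet_distance(mu_r, sig_r, mu_r, sig_r))  # identical stats -> ~0
print(frechet_distance(mu_r, sig_r, mu_f, sig_f))  # shifted stats -> > 0
```

MiFID additionally penalizes memorization (generated images that are near-copies of training images), which is why simply reproducing the 300 Monets would score poorly.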

Citation:¶

Amy Jang, Ana Sofia Uzsoy, Phil Culliton. (2020). I’m Something of a Painter Myself. Kaggle. https://kaggle.com/competitions/gan-getting-started

Back to Table of Contents¶

2. Setup ¶


In [1]:
import os
import sys
import zipfile
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from PIL import Image

import tensorflow as tf
import torch
Back to Table of Contents¶

2.1. Environment Information for Reproducibility: ¶

In [2]:
print(f"Python version: {sys.version}")

packages = [pd, np, sns, tf, torch]
for package in packages:
    print(f"{str(package).partition('from')[0]} using version: {package.__version__}")
Python version: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
<module 'pandas'  using version: 2.2.2
<module 'numpy'  using version: 1.26.4
<module 'seaborn'  using version: 0.12.2
<module 'tensorflow'  using version: 2.16.1
<module 'torch'  using version: 2.4.0
Back to Table of Contents¶

2.2. Importing the Data: ¶

In [3]:
# Set directories
current_wdir = os.getcwd()
data_folder = '/kaggle/input/gan-getting-started'

img_size = 256
batch_size = 16
rand_seed = 11

monet = tf.keras.utils.image_dataset_from_directory(f"{data_folder}/monet_jpg",
                                                      label_mode = None,
                                                      image_size = (img_size, img_size),
                                                      batch_size = batch_size)
monet = monet.map(lambda x: (x / 127.5) - 1)
Found 300 files.

As expected, the dataset contains 300 images of Monet paintings.
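The `(x / 127.5) - 1` scaling above, and the inverse used later for plotting, can be checked quickly with plain numpy:

```python
import numpy as np

# Forward normalization used in the dataloader: [0, 255] -> [-1, 1].
pixels = np.array([0.0, 127.5, 255.0])
normalized = pixels / 127.5 - 1

# Inverse used when plotting: [-1, 1] -> [0, 255].
restored = (normalized + 1) * 127.5

print(normalized)  # -> -1, 0, 1
print(restored)    # -> 0, 127.5, 255
```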

Back to Table of Contents¶

3. Data Preprocessing ¶


3.1. First Looks: ¶

To get an idea of what the paintings look like, let's plot a few here.

In [4]:
for img in monet:
    fig, ax = plt.subplots(2, 3, sharex = True, sharey = True)
    img = ((img.numpy() + 1) * 127.5).astype(np.uint8)
    for i in range(6):
        ax[i // 3, i % 3].imshow(img[i])
    break
Back to Table of Contents¶

3.2. Image Augmentation: ¶

Three hundred images isn't much data for training a deep learning model.

Let's add some image augmentations/transformations to the dataloader, effectively increasing the diversity of the dataset.

Transformations:

  • Randomly flip the image.
    • Horizontal
    • Vertical
  • Random Rotation
  • Randomly Crop the Image (output is kept at the original image size.)
In [5]:
# Transformation object.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomCrop(img_size, img_size)])

# Autotune to set buffer size.
AUTOTUNE = tf.data.AUTOTUNE

# Function to setup the transformations within the dataloader.
def prepare(ds, shuffle = False, augment = False):
  if shuffle:
    ds = ds.shuffle(1000)

  if augment:
    ds = ds.map(lambda x: (data_augmentation(x, training=True)), 
                num_parallel_calls = AUTOTUNE)

  # Prefetch buffer necessary here to ensure proper loading during training.
  return ds.prefetch(buffer_size = AUTOTUNE)

Now we can prepare the dataloader with the function we just created.

In [6]:
monet_transformed = prepare(monet, shuffle = True, augment = True)

Let's plot a few images and see the transformations in practice.

In [8]:
for img in monet_transformed:
    fig, ax = plt.subplots(2, 3, sharex = True, sharey = True)
    img = ((img.numpy() + 1) * 127.5).astype(np.uint8)
    for i in range(6):
        ax[i // 3, i % 3].imshow(img[i])
    break

Looks good; we can clearly see the rotations and a few flips even without seeing the originals.

Back to Table of Contents¶

4. Exploratory Data Analysis (EDA) ¶


Let's load the images outside the dataloader so we can avoid the batching and do a little EDA. Since there are only 300, this isn't too much of a memory hit.

In [9]:
monet_files = os.listdir(f"{data_folder}/monet_jpg")
monet_array = np.zeros((len(monet_files), 256, 256, 3)) # 4D Array with shape (n_images, height, width, channels)
for i, file in enumerate(monet_files):
    img = Image.open(f"{data_folder}/monet_jpg/{file}")
    monet_array[i] = np.asarray(img)

Since a Generative Adversarial Network is going to be trained to generate images in Monet's style from scratch, it might be good to inform ourselves of what color intensities are present within this dataset.

In [10]:
# Find the mean pixel intensity of each image (down to one value along all axes).
mean_intensity = monet_array.mean(axis = (1, 2, 3))

sns.histplot(mean_intensity, binwidth = 5, kde = True)
plt.title('Average Pixel Intensity')
plt.show()

Looking at the mean pixel intensity of each image, we can see that few images skew white (close to 255, right) or black (close to 0, left). Most images sit near the middle, around 130 (127.5 would be dead center).
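As a quick arithmetic check, that average of roughly 130 lands just above the midpoint once mapped into the [-1, 1] space the model trains on:

```python
# Map the observed average intensity (~130) into the normalized [-1, 1]
# space the model trains on; 127.5 maps exactly to 0.
mean_intensity_observed = 130.0
normalized = mean_intensity_observed / 127.5 - 1
print(round(normalized, 4))  # 0.0196
```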

In [11]:
mean_intensity_rgb = monet_array.mean(axis = (1, 2))

# Plot the distribution of the mean intensities of individual RGB channels.
rgb_dict = {0:'Red', 1:'Green', 2:'Blue'}
for i in rgb_dict.keys():
    sns.histplot(mean_intensity_rgb[:,i], color = rgb_dict[i], binwidth = 5, kde = True)
plt.title('Average Pixel Intensity of Each Channel (RGB)')
plt.legend(list(rgb_dict.values()))
plt.show()
  • The three channel distributions are broadly similar, aside from a slight skew in the blue channel.
  • The green and red distributions are nearly identical.

It seems Monet painted with a wide variety of colors, and this variety should be reflected in the final generations.

Back to Table of Contents¶

5. Models ¶


5.1. Deep Convolutional Generative Adversarial Network (DCGAN) ¶

Instead of performing style-transfer like pix2pix or with a CycleGAN, I plan to train a DCGAN to generate Monet-like images and see how well it performs in this competition.

To review, a DCGAN is a variant of the more general GAN that uses deep convolutional layers in both the discriminator and generator, which makes this type of model well-suited to image data.

DCGANs are comprised of two models: the Discriminator, whose purpose is to distinguish (discriminate) real images from fake ones, and the Generator, whose role is to produce images that imitate real images well enough to fool the Discriminator. These goals are diametrically opposed: if one model is succeeding, the other is failing. This tug-of-war shows up in each model's loss, but there is no single metric that tells you when the model has converged, overfit, or suffered mode collapse (more on this later). That makes tuning a GAN fairly subjective; balancing the two models so that they converge into a working model can be a difficult and time-consuming process.
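The adversarial setup described above is the standard GAN minimax game from Goodfellow et al.'s original formulation, where the discriminator $D$ maximizes and the generator $G$ minimizes the same value function:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

In practice the generator is usually trained with a non-saturating variant, pushing $D$ to label $G(z)$ as real rather than directly minimizing $\log(1 - D(G(z)))$.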

In [12]:
latent_dim = 512
epochs = 250
num_img_saved = 1

# Image dir saved per epoch.
os.makedirs('/kaggle/working/Generated_Images', exist_ok = True)
# Image dir saved for submission.
os.makedirs('/kaggle/working/images', exist_ok = True)

Now we can initialize the Discriminator (sometimes called the Critic, depending on the application).

Generally, the discriminator's role isn't unlike a CNN used for classification: downsample an image into a feature space and classify it.

It's very important not only to plan the architectures of the discriminator and generator (input vs. output shapes, filter size, number of filters, strides, etc.), but also to choose the proper activation functions, especially for the last layers.
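As a sanity check on the downsampling plan: with `padding = 'same'`, a stride-2 Conv2D produces `ceil(input / stride)` outputs, so five stride-2 layers take 256 down to 8. A minimal sketch (the `same_pad_out` helper is just for illustration):

```python
import math

def same_pad_out(size, stride):
    # Keras 'same' padding: output spatial size = ceil(input / stride).
    return math.ceil(size / stride)

size = 256
sizes = []
for _ in range(5):  # five stride-2 conv layers, as in the discriminator here
    size = same_pad_out(size, 2)
    sizes.append(size)

print(sizes)  # [128, 64, 32, 16, 8], matching the model summary
```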

Since we're using Binary Cross-Entropy as the loss function, the last layer can either use a sigmoid activation or output raw logits, with the loss function set to from_logits = True (the approach taken here).
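The two options are mathematically equivalent; a small numpy sketch (an illustration of the identity, not TensorFlow's implementation) comparing BCE on sigmoid outputs against the numerically stable from-logits form:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_from_probs(y, p):
    # Standard binary cross-entropy on probabilities.
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def bce_from_logits(y, z):
    # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|)).
    return np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z))))

y = np.array([1.0, 0.0, 1.0, 0.0])
z = np.array([2.0, -1.0, 0.5, 3.0])

print(np.isclose(bce_from_probs(y, sigmoid(z)), bce_from_logits(y, z)))  # True
```

The from-logits form avoids computing `log(sigmoid(z))` for large-magnitude logits, which is why it's generally preferred.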

In [13]:
discriminator = tf.keras.Sequential(
    [tf.keras.Input(shape = (img_size, img_size, 3)),
     tf.keras.layers.Conv2D(64, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
     #tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Conv2D(128, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Conv2D(128, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Conv2D(256, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Conv2D(256, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Flatten(),
     tf.keras.layers.Dropout(0.2),
     tf.keras.layers.Dense(1)],
     name = 'Discriminator')

discriminator.summary()
Model: "Discriminator"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 128, 128, 64)   │         4,800 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu (LeakyReLU)         │ (None, 128, 128, 64)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 64, 64, 128)    │       204,800 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization             │ (None, 64, 64, 128)    │           512 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_1 (LeakyReLU)       │ (None, 64, 64, 128)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_2 (Conv2D)               │ (None, 32, 32, 128)    │       409,600 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_1           │ (None, 32, 32, 128)    │           512 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_2 (LeakyReLU)       │ (None, 32, 32, 128)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_3 (Conv2D)               │ (None, 16, 16, 256)    │       819,200 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_2           │ (None, 16, 16, 256)    │         1,024 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_3 (LeakyReLU)       │ (None, 16, 16, 256)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_4 (Conv2D)               │ (None, 8, 8, 256)      │     1,638,400 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_3           │ (None, 8, 8, 256)      │         1,024 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_4 (LeakyReLU)       │ (None, 8, 8, 256)      │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 16384)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 16384)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 1)              │        16,385 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 3,096,257 (11.81 MB)
 Trainable params: 3,094,721 (11.81 MB)
 Non-trainable params: 1,536 (6.00 KB)

Now for the Generator.

This model takes a random latent vector as input and upsamples it through successive feature maps to the size of a real image, which is then fed to the Discriminator during training. Several different architectures can work here, and many GAN variants are distinguished by the shape of their Generator.

Here, the choice of activation function for the last layer depends mostly on the range of pixel values in our images. Since we normalized the images to [-1, 1], the hyperbolic tangent (tanh) activation has been chosen; had we normalized to [0, 1], sigmoid would be appropriate. In practice, tanh is the accepted choice and tends to perform best in GANs.

In [14]:
# # Shallow DCGAN
# generator = tf.keras.Sequential(
#     [tf.keras.Input(shape = (latent_dim, )),
#      tf.keras.layers.Dense(16 * 16 * latent_dim),
#      tf.keras.layers.Reshape((16, 16, latent_dim)),
#      tf.keras.layers.Conv2DTranspose(256, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
#      #tf.keras.layers.BatchNormalization(),
#      tf.keras.layers.LeakyReLU(negative_slope = 0.2),
#      tf.keras.layers.Conv2DTranspose(256, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
#      tf.keras.layers.BatchNormalization(),
#      tf.keras.layers.LeakyReLU(negative_slope = 0.2),
#      tf.keras.layers.Conv2DTranspose(512, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
#      tf.keras.layers.BatchNormalization(),
#      tf.keras.layers.LeakyReLU(negative_slope = 0.2),
#      tf.keras.layers.Conv2DTranspose(512, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
#      tf.keras.layers.BatchNormalization(),
#      tf.keras.layers.LeakyReLU(negative_slope = 0.2),
#      tf.keras.layers.Conv2D(3, kernel_size = 7, padding = 'same', activation = 'tanh')],
#      name = 'Generator')
# generator.summary()

# # Flipped DCGAN
# generator = tf.keras.Sequential(
#     [tf.keras.Input(shape = (latent_dim, )),
#      tf.keras.layers.Dense(4 * 4 * latent_dim),
#      tf.keras.layers.Reshape((4, 4, latent_dim)),
#      tf.keras.layers.Conv2DTranspose(512, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
#      #tf.keras.layers.BatchNormalization(),
#      tf.keras.layers.LeakyReLU(negative_slope = 0.2),
#      tf.keras.layers.Conv2DTranspose(256, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
#      tf.keras.layers.BatchNormalization(),
#      tf.keras.layers.LeakyReLU(negative_slope = 0.2),
#      tf.keras.layers.Conv2DTranspose(128, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
#      tf.keras.layers.BatchNormalization(),
#      tf.keras.layers.LeakyReLU(negative_slope = 0.2),
#      tf.keras.layers.Conv2DTranspose(128, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
#      tf.keras.layers.BatchNormalization(),
#      tf.keras.layers.LeakyReLU(negative_slope = 0.2),
#      tf.keras.layers.Conv2D(3, kernel_size = 7, padding = 'same', activation = 'tanh')],
#      name = 'Generator')
# generator.summary()

# Deeper DCGAN
generator = tf.keras.Sequential(
    [tf.keras.Input(shape = (latent_dim, )),
     tf.keras.layers.Dense(16 * 16 * latent_dim),
     tf.keras.layers.Reshape((16, 16, latent_dim)),
     tf.keras.layers.Conv2DTranspose(512, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
     #tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Conv2DTranspose(256, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Conv2DTranspose(128, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Conv2DTranspose(64, kernel_size = 5, strides = 2, padding = 'same', use_bias = False),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Conv2D(32, kernel_size = 5, padding = 'same', use_bias = False),
     tf.keras.layers.BatchNormalization(),
     tf.keras.layers.LeakyReLU(negative_slope = 0.2),
     tf.keras.layers.Conv2D(3, kernel_size = 7, padding = 'same', activation = 'tanh')],
     name = 'Generator')
generator.summary()
Model: "Generator"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_1 (Dense)                 │ (None, 131072)         │    67,239,936 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ reshape (Reshape)               │ (None, 16, 16, 512)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_transpose                │ (None, 32, 32, 512)    │     6,553,600 │
│ (Conv2DTranspose)               │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_5 (LeakyReLU)       │ (None, 32, 32, 512)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_transpose_1              │ (None, 64, 64, 256)    │     3,276,800 │
│ (Conv2DTranspose)               │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_4           │ (None, 64, 64, 256)    │         1,024 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_6 (LeakyReLU)       │ (None, 64, 64, 256)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_transpose_2              │ (None, 128, 128, 128)  │       819,200 │
│ (Conv2DTranspose)               │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_5           │ (None, 128, 128, 128)  │           512 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_7 (LeakyReLU)       │ (None, 128, 128, 128)  │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_transpose_3              │ (None, 256, 256, 64)   │       204,800 │
│ (Conv2DTranspose)               │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_6           │ (None, 256, 256, 64)   │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_8 (LeakyReLU)       │ (None, 256, 256, 64)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_5 (Conv2D)               │ (None, 256, 256, 32)   │        51,200 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_7           │ (None, 256, 256, 32)   │           128 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_9 (LeakyReLU)       │ (None, 256, 256, 32)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_6 (Conv2D)               │ (None, 256, 256, 3)    │         4,707 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 78,152,163 (298.13 MB)
 Trainable params: 78,151,203 (298.12 MB)
 Non-trainable params: 960 (3.75 KB)

Now we need a way to set these two models against each other, in an adversarial fashion. The following class handles the training procedure.

In [15]:
# GAN Class was altered from https://keras.io/examples/generative/dcgan_overriding_train_step/
class GAN(tf.keras.Model):
    def __init__(self, discriminator, generator, latent_dim):
        super().__init__()
        self.discriminator = discriminator
        self.generator = generator
        self.latent_dim = latent_dim
        # self.seed_generator = tf.keras.random.SeedGenerator(rand_seed)

    def compile(self, d_optimizer, g_optimizer, loss_fn):
        super().compile()
        self.d_optimizer = d_optimizer
        self.g_optimizer = g_optimizer
        self.loss_fn = loss_fn
        self.d_loss_metric = tf.keras.metrics.Mean(name = "D_Loss")
        self.g_loss_metric = tf.keras.metrics.Mean(name = "G_Loss")

    @property
    def metrics(self):
        return [self.d_loss_metric, self.g_loss_metric]

    def train_step(self, real_images):
        # Sample random points in the latent space.
        batch_size = tf.keras.ops.shape(real_images)[0]
        random_latent_vectors = tf.keras.random.normal(
            shape=(batch_size, self.latent_dim))#, seed=self.seed_generator)

        # Decode the latent space to a fake image.
        generated_images = self.generator(random_latent_vectors)

        # Zip with real images.
        combined_images = tf.keras.ops.concatenate([generated_images, real_images], axis=0)

        # # Assemble labels discriminating real from fake images.
        # labels = tf.keras.ops.concatenate([tf.keras.ops.ones((batch_size, 1)), tf.keras.ops.zeros((batch_size, 1))], axis=0)
        
        # # Add noise to labels.
        # labels += 0.05 * tf.random.uniform(tf.shape(labels))

        # Labels matching the concat order above: generated (fake) images
        # first, then real images. Fake images are labeled 1, real images 0.
        fake_labels = tf.ones((batch_size, 1))
        real_labels = tf.zeros((batch_size, 1))

        # One-sided label smoothing: soften the fake labels from 1 to 0.9.
        smoothed_fake_labels = fake_labels * 0.9
        combined_labels = tf.concat([smoothed_fake_labels, real_labels], axis=0)

        # Add some noise to the labels (optional but can help)
        combined_labels += 0.05 * tf.random.uniform(tf.shape(combined_labels))

        # Train the discriminator.
        with tf.GradientTape() as tape:
            predictions = self.discriminator(combined_images)
            d_loss = self.loss_fn(combined_labels, predictions)
        grads = tape.gradient(d_loss, self.discriminator.trainable_weights)
        self.d_optimizer.apply_gradients(
            zip(grads, self.discriminator.trainable_weights))

        # Sample random points in the latent space.
        random_latent_vectors = tf.keras.random.normal(
            shape=(batch_size, self.latent_dim))#, seed=self.seed_generator)

        # Assemble misleading labels claiming every generated image is real
        # (label 0 under the convention above).
        misleading_labels = tf.keras.ops.zeros((batch_size, 1))

        # Train the generator.
        with tf.GradientTape() as tape:
            predictions = self.discriminator(self.generator(random_latent_vectors))
            g_loss = self.loss_fn(misleading_labels, predictions)
        grads = tape.gradient(g_loss, self.generator.trainable_weights)
        self.g_optimizer.apply_gradients(zip(grads, self.generator.trainable_weights))

        # Update loss metrics.
        self.d_loss_metric.update_state(d_loss)
        self.g_loss_metric.update_state(g_loss)
        return {"d_loss": self.d_loss_metric.result(),
                "g_loss": self.g_loss_metric.result()}

Here we create a Keras callback that runs at the end of each epoch, saving a few randomly generated images. This is invaluable for the subjective tuning and analysis of the model, since mode collapse or non-convergence can happen at any point during GAN training.

In [16]:
class GANMonitor(tf.keras.callbacks.Callback):
    def __init__(self, num_img=3, latent_dim=latent_dim):
        self.num_img = num_img
        self.latent_dim = latent_dim
        # self.seed_generator = tf.keras.random.SeedGenerator(rand_seed)

    # Save a few generated images at each epoch.
    def on_epoch_end(self, epoch, logs=None):
        random_latent_vectors = tf.keras.random.normal(shape=(self.num_img, self.latent_dim))
        generated_images = self.model.generator(random_latent_vectors, training = False)
        generated_images = (generated_images + 1) * 127.5
        for i in range(self.num_img):
            img = tf.keras.utils.array_to_img(generated_images[i])
            img.save(f"{current_wdir}/Generated_Images/{epoch+1:03}_{i}.jpg")
        
        # Save submission images during the last 20 epochs as insurance
        # against late mode collapse: generate 400 images in batches of 100.
        if epoch > (epochs - 21):
            n = 400
            chunk = 100
            random_latent_vectors = tf.keras.random.normal(shape=(n, self.latent_dim))
            for start in range(0, n, chunk):
                generated_images = self.model.generator(
                    random_latent_vectors[start:start + chunk], training = False)
                generated_images = (generated_images + 1) * 127.5
                for offset in range(chunk):
                    img = tf.keras.utils.array_to_img(generated_images[offset])
                    img.save(f"{current_wdir}/images/{epoch+1:03}_{start + offset}.jpg")
In [17]:
# Suppress TF warning messages.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
# Check to make sure GPU will be used.
tf.config.list_physical_devices('GPU')
Out[17]:
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
In [18]:
gan = GAN(discriminator = discriminator, generator = generator, latent_dim = latent_dim)
gan.compile(
    d_optimizer=tf.keras.optimizers.Adam(learning_rate = 0.0001, beta_1 = 0.5, beta_2 = 0.9),
    g_optimizer=tf.keras.optimizers.Adam(learning_rate = 0.0002, beta_1 = 0.5, beta_2 = 0.9),
    loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits = True))

with tf.device('/device:GPU:0'):
    history_gan = gan.fit(monet_transformed,
                          epochs = epochs,
                          callbacks = [GANMonitor(num_img = num_img_saved, latent_dim = latent_dim)])
Epoch 1/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 61s 2s/step - d_loss: 0.5995 - g_loss: 0.7060
Epoch 2/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 416ms/step - d_loss: 0.5243 - g_loss: 0.7792
Epoch 3/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.4333 - g_loss: 0.8907
Epoch 4/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 416ms/step - d_loss: 0.4048 - g_loss: 1.1171
Epoch 5/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.4802 - g_loss: 1.2327
Epoch 6/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6898 - g_loss: 1.0059
Epoch 7/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6458 - g_loss: 1.2977
Epoch 8/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6970 - g_loss: 0.8350
Epoch 9/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6178 - g_loss: 1.4931
Epoch 10/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6441 - g_loss: 0.8571
Epoch 11/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7234 - g_loss: 1.3230
Epoch 12/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6301 - g_loss: 1.1793
Epoch 13/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6215 - g_loss: 1.1822
Epoch 14/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6844 - g_loss: 1.0288
Epoch 15/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.5528 - g_loss: 1.8988
Epoch 16/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6483 - g_loss: 1.5720
Epoch 17/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6535 - g_loss: 0.8657
Epoch 18/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7057 - g_loss: 0.9742
Epoch 19/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6190 - g_loss: 0.7501
Epoch 20/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6663 - g_loss: 1.0221
Epoch 21/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6832 - g_loss: 1.1714
Epoch 22/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6876 - g_loss: 1.0478
Epoch 23/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6790 - g_loss: 0.9732
Epoch 24/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6563 - g_loss: 1.0403
Epoch 25/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 10s 419ms/step - d_loss: 0.6304 - g_loss: 0.9598
Epoch 26/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6384 - g_loss: 0.8944
Epoch 27/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6799 - g_loss: 0.7423
Epoch 28/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7478 - g_loss: 1.0476
Epoch 29/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6262 - g_loss: 0.7712
Epoch 30/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7038 - g_loss: 0.8086
Epoch 31/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7019 - g_loss: 0.8409
Epoch 32/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6774 - g_loss: 0.7618
Epoch 33/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7162 - g_loss: 0.7582
Epoch 34/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6758 - g_loss: 0.7831
Epoch 35/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 424ms/step - d_loss: 0.6756 - g_loss: 0.9262
Epoch 36/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 423ms/step - d_loss: 0.6616 - g_loss: 0.7506
Epoch 37/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6739 - g_loss: 0.8385
Epoch 38/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6175 - g_loss: 0.9005
Epoch 39/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.5860 - g_loss: 1.0323
Epoch 40/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 9s 418ms/step - d_loss: 0.7796 - g_loss: 0.8628
Epoch 41/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6822 - g_loss: 0.8258
Epoch 42/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7023 - g_loss: 1.0893
Epoch 43/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6789 - g_loss: 1.0260
Epoch 44/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6988 - g_loss: 0.8076
Epoch 45/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6752 - g_loss: 0.9751
Epoch 46/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6707 - g_loss: 0.8030
Epoch 47/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6805 - g_loss: 0.8056
Epoch 48/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.7974 - g_loss: 0.8561
Epoch 49/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7009 - g_loss: 0.7527
Epoch 50/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6782 - g_loss: 0.8860
Epoch 51/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6883 - g_loss: 0.8074
Epoch 52/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6801 - g_loss: 0.9355
Epoch 53/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6755 - g_loss: 0.8392
Epoch 54/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6848 - g_loss: 0.8783
Epoch 55/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6637 - g_loss: 0.8493
Epoch 56/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7450 - g_loss: 0.9564
Epoch 57/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6790 - g_loss: 0.9146
Epoch 58/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6717 - g_loss: 0.9516
Epoch 59/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7065 - g_loss: 0.7331
Epoch 60/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 424ms/step - d_loss: 0.6742 - g_loss: 0.8924
Epoch 61/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.7088 - g_loss: 0.8081
Epoch 62/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6741 - g_loss: 0.8263
Epoch 63/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6826 - g_loss: 0.8423
Epoch 64/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6739 - g_loss: 0.9185
Epoch 65/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6523 - g_loss: 0.7841
Epoch 66/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6284 - g_loss: 0.8586
Epoch 67/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6374 - g_loss: 1.0529
Epoch 68/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 420ms/step - d_loss: 0.6700 - g_loss: 1.2820
Epoch 69/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6660 - g_loss: 1.3240
Epoch 70/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6689 - g_loss: 0.9486
Epoch 71/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6877 - g_loss: 0.9281
Epoch 72/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6792 - g_loss: 1.0678
Epoch 73/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7038 - g_loss: 0.9430
Epoch 74/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6771 - g_loss: 0.9297
Epoch 75/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7036 - g_loss: 0.8331
Epoch 76/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.7028 - g_loss: 0.8204
Epoch 77/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7086 - g_loss: 0.6908
Epoch 78/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6983 - g_loss: 0.6555
Epoch 79/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6984 - g_loss: 0.7405
Epoch 80/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6732 - g_loss: 0.7492
Epoch 81/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6856 - g_loss: 0.7898
Epoch 82/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7125 - g_loss: 0.8927
Epoch 83/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6991 - g_loss: 0.8302
Epoch 84/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6966 - g_loss: 0.6960
Epoch 85/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6975 - g_loss: 0.7516
Epoch 86/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7027 - g_loss: 0.7847
Epoch 87/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6611 - g_loss: 0.7931
Epoch 88/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6804 - g_loss: 0.9423
Epoch 89/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6798 - g_loss: 0.8386
Epoch 90/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 424ms/step - d_loss: 0.6982 - g_loss: 0.7087
Epoch 91/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6627 - g_loss: 0.8166
Epoch 92/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6764 - g_loss: 0.8547
Epoch 93/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 10s 419ms/step - d_loss: 0.7288 - g_loss: 0.8773
Epoch 94/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6825 - g_loss: 0.7086
Epoch 95/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6987 - g_loss: 0.7329
Epoch 96/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7006 - g_loss: 0.7386
Epoch 97/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7035 - g_loss: 0.6891
Epoch 98/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6952 - g_loss: 0.7165
Epoch 99/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6893 - g_loss: 0.6849
Epoch 100/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7324 - g_loss: 0.7419
Epoch 101/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6841 - g_loss: 0.7430
Epoch 102/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6805 - g_loss: 0.7382
Epoch 103/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6843 - g_loss: 0.7030
Epoch 104/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6991 - g_loss: 0.7525
Epoch 105/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 423ms/step - d_loss: 0.7035 - g_loss: 0.7139
Epoch 106/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7056 - g_loss: 0.6788
Epoch 107/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6975 - g_loss: 0.7715
Epoch 108/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7022 - g_loss: 0.6830
Epoch 109/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6860 - g_loss: 0.6966
Epoch 110/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7075 - g_loss: 0.7270
Epoch 111/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6850 - g_loss: 0.7324
Epoch 112/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6967 - g_loss: 0.7485
Epoch 113/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6960 - g_loss: 0.7539
Epoch 114/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6879 - g_loss: 0.7514
Epoch 115/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7537 - g_loss: 0.7860
Epoch 116/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7048 - g_loss: 0.7390
Epoch 117/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6991 - g_loss: 0.7073
Epoch 118/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6911 - g_loss: 0.8152
Epoch 119/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 420ms/step - d_loss: 0.7017 - g_loss: 0.7504
Epoch 120/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6945 - g_loss: 0.7196
Epoch 121/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7167 - g_loss: 0.7312
Epoch 122/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6925 - g_loss: 0.8971
Epoch 123/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6895 - g_loss: 0.7018
Epoch 124/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6973 - g_loss: 0.7625
Epoch 125/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6950 - g_loss: 0.7437
Epoch 126/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6999 - g_loss: 0.7095
Epoch 127/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 10s 418ms/step - d_loss: 0.6945 - g_loss: 0.6996
Epoch 128/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7030 - g_loss: 0.7239
Epoch 129/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7057 - g_loss: 0.7000
Epoch 130/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6938 - g_loss: 0.6964
Epoch 131/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.7931 - g_loss: 0.9981
Epoch 132/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6914 - g_loss: 0.7656
Epoch 133/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7142 - g_loss: 0.7120
Epoch 134/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6901 - g_loss: 0.6991
Epoch 135/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6903 - g_loss: 0.7597
Epoch 136/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7850 - g_loss: 0.8632
Epoch 137/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7037 - g_loss: 0.7739
Epoch 138/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 9s 420ms/step - d_loss: 0.7015 - g_loss: 0.8160
Epoch 139/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7089 - g_loss: 0.7047
Epoch 140/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6990 - g_loss: 0.6592
Epoch 141/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 424ms/step - d_loss: 0.6988 - g_loss: 0.6880
Epoch 142/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7082 - g_loss: 0.7167
Epoch 143/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6958 - g_loss: 0.6934
Epoch 144/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6856 - g_loss: 0.7042
Epoch 145/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6943 - g_loss: 0.6585
Epoch 146/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6943 - g_loss: 0.6892
Epoch 147/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6925 - g_loss: 0.6807
Epoch 148/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6928 - g_loss: 0.6965
Epoch 149/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6890 - g_loss: 0.7263
Epoch 150/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7119 - g_loss: 0.7282
Epoch 151/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 423ms/step - d_loss: 0.6665 - g_loss: 1.0832
Epoch 152/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7012 - g_loss: 0.7956
Epoch 153/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6918 - g_loss: 0.7261
Epoch 154/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6978 - g_loss: 0.7098
Epoch 155/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7008 - g_loss: 0.9592
Epoch 156/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6870 - g_loss: 0.7014
Epoch 157/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 420ms/step - d_loss: 0.6915 - g_loss: 0.7787
Epoch 158/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 420ms/step - d_loss: 0.7014 - g_loss: 0.6863
Epoch 159/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6947 - g_loss: 0.7183
Epoch 160/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7008 - g_loss: 0.6981
Epoch 161/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6943 - g_loss: 0.6933
Epoch 162/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6998 - g_loss: 0.7048
Epoch 163/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6934 - g_loss: 0.6849
Epoch 164/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7023 - g_loss: 0.7390
Epoch 165/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6957 - g_loss: 0.7058
Epoch 166/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6918 - g_loss: 0.7010
Epoch 167/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7164 - g_loss: 0.7485
Epoch 168/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6989 - g_loss: 0.7023
Epoch 169/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6918 - g_loss: 0.7291
Epoch 170/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7009 - g_loss: 0.7789
Epoch 171/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7038 - g_loss: 0.7277
Epoch 172/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6825 - g_loss: 0.6790
Epoch 173/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6924 - g_loss: 0.7991
Epoch 174/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7046 - g_loss: 0.7772
Epoch 175/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6946 - g_loss: 0.7109
Epoch 176/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6860 - g_loss: 0.7293
Epoch 177/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7023 - g_loss: 0.8418
Epoch 178/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6849 - g_loss: 0.7553
Epoch 179/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6887 - g_loss: 0.8320
Epoch 180/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.7162 - g_loss: 0.7077
Epoch 181/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 422ms/step - d_loss: 0.7027 - g_loss: 0.7301
Epoch 182/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7041 - g_loss: 0.7223
Epoch 183/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7202 - g_loss: 0.7429
Epoch 184/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7032 - g_loss: 0.7834
Epoch 185/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6992 - g_loss: 0.6879
Epoch 186/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6963 - g_loss: 0.6755
Epoch 187/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6995 - g_loss: 0.6724
Epoch 188/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 424ms/step - d_loss: 0.6895 - g_loss: 0.7052
Epoch 189/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6989 - g_loss: 0.6796
Epoch 190/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6910 - g_loss: 0.7224
Epoch 191/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6980 - g_loss: 0.6790
Epoch 192/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6928 - g_loss: 0.7007
Epoch 193/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 423ms/step - d_loss: 0.7003 - g_loss: 0.7596
Epoch 194/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 423ms/step - d_loss: 0.7027 - g_loss: 0.8983
Epoch 195/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7044 - g_loss: 0.7506
Epoch 196/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6940 - g_loss: 0.6767
Epoch 197/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6963 - g_loss: 0.8181
Epoch 198/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6890 - g_loss: 0.6946
Epoch 199/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6910 - g_loss: 0.6763
Epoch 200/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 420ms/step - d_loss: 0.7095 - g_loss: 0.7940
Epoch 201/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6859 - g_loss: 0.7048
Epoch 202/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7209 - g_loss: 0.7057
Epoch 203/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6930 - g_loss: 0.7200
Epoch 204/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7118 - g_loss: 0.7212
Epoch 205/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 423ms/step - d_loss: 0.7359 - g_loss: 0.8513
Epoch 206/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6756 - g_loss: 0.7476
Epoch 207/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 423ms/step - d_loss: 0.7004 - g_loss: 0.7191
Epoch 208/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7035 - g_loss: 0.7414
Epoch 209/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6919 - g_loss: 0.7030
Epoch 210/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 420ms/step - d_loss: 0.7285 - g_loss: 0.7139
Epoch 211/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6932 - g_loss: 0.7280
Epoch 212/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6923 - g_loss: 0.7073
Epoch 213/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6954 - g_loss: 0.6527
Epoch 214/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7050 - g_loss: 0.6984
Epoch 215/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7004 - g_loss: 0.7032
Epoch 216/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6932 - g_loss: 0.7137
Epoch 217/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.7011 - g_loss: 0.6766
Epoch 218/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 417ms/step - d_loss: 0.6915 - g_loss: 0.7087
Epoch 219/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.7835 - g_loss: 0.8190
Epoch 220/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6840 - g_loss: 0.7050
Epoch 221/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6913 - g_loss: 0.6939
Epoch 222/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6988 - g_loss: 0.6882
Epoch 223/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 423ms/step - d_loss: 0.6953 - g_loss: 0.7271
Epoch 224/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 10s 419ms/step - d_loss: 0.6903 - g_loss: 0.6656
Epoch 225/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6964 - g_loss: 0.6751
Epoch 226/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 419ms/step - d_loss: 0.6952 - g_loss: 0.6585
Epoch 227/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6986 - g_loss: 0.6432
Epoch 228/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6977 - g_loss: 0.6799
Epoch 229/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.6902 - g_loss: 0.7503
Epoch 230/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 8s 418ms/step - d_loss: 0.7005 - g_loss: 0.7136
Epoch 231/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 15s 808ms/step - d_loss: 0.7005 - g_loss: 0.7074
Epoch 232/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 557ms/step - d_loss: 0.6989 - g_loss: 0.6910
Epoch 233/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 558ms/step - d_loss: 0.6937 - g_loss: 0.6667
Epoch 234/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 559ms/step - d_loss: 0.7009 - g_loss: 0.6801
Epoch 235/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 557ms/step - d_loss: 0.7071 - g_loss: 0.7210
Epoch 236/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 558ms/step - d_loss: 0.6984 - g_loss: 0.6520
Epoch 237/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 559ms/step - d_loss: 0.7038 - g_loss: 0.6639
Epoch 238/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 557ms/step - d_loss: 0.6948 - g_loss: 0.7839
Epoch 239/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 558ms/step - d_loss: 0.6966 - g_loss: 0.7014
Epoch 240/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 558ms/step - d_loss: 0.6804 - g_loss: 0.7229
Epoch 241/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 559ms/step - d_loss: 0.7481 - g_loss: 0.7537
Epoch 242/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 559ms/step - d_loss: 0.7272 - g_loss: 0.7584
Epoch 243/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 559ms/step - d_loss: 0.7036 - g_loss: 0.6919
Epoch 244/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 562ms/step - d_loss: 0.7078 - g_loss: 0.7371
Epoch 245/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 558ms/step - d_loss: 0.6785 - g_loss: 0.7327
Epoch 246/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 558ms/step - d_loss: 0.6924 - g_loss: 0.7303
Epoch 247/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 558ms/step - d_loss: 0.7446 - g_loss: 0.8061
Epoch 248/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 560ms/step - d_loss: 0.7063 - g_loss: 0.7100
Epoch 249/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 557ms/step - d_loss: 0.6967 - g_loss: 0.7071
Epoch 250/250
19/19 ━━━━━━━━━━━━━━━━━━━━ 11s 557ms/step - d_loss: 0.6997 - g_loss: 0.6750
Back to Table of Contents¶

6. Results and Discussion ¶


Now that the model has finished training over the course of 250 epochs, let's plot the loss values for both the Generator and Discriminator throughout training.

In [19]:
history_gan_df = pd.DataFrame({'D_Loss': history_gan.history['d_loss'], 'G_Loss': history_gan.history['g_loss']})

sns.lineplot(history_gan_df)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('GAN Training History')
plt.axhline(y = 0.7, c = 'k', alpha = 0.4)
plt.savefig('./Generated_Images/GAN_History.jpg')
plt.show()
/opt/conda/lib/python3.10/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
  with pd.option_context('mode.use_inf_as_na', True):

A few things to note:

  • The thin black line is drawn at y = 0.7, the value the two models typically "fight" around. Values further from this line indicate that one model is outperforming the other. Some deviation is expected, but when one model stays far from the 0.7 line for too long, it indicates a problem with the model.
  • When one model performs well (low loss), the other's loss spikes.
  • During the first 75 epochs both losses, especially the Generator's, oscillated heavily.
  • Sometime after epoch 150 the models converged and became quite stable, with much smaller oscillations.
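As a quick sanity check on that 0.7 reference line: with binary cross-entropy, a Discriminator that is maximally uncertain (predicting 0.5 for every sample) produces a loss of −ln(0.5) ≈ 0.693, which is why both losses hover near 0.7 at equilibrium:

```python
import math

# Binary cross-entropy when the Discriminator predicts 0.5 for every sample.
# This is the theoretical equilibrium value both losses hover around.
equilibrium_loss = -math.log(0.5)
print(round(equilibrium_loss, 3))  # 0.693
```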

It's also important to note that while the loss stabilized, it still jitters, which isn't necessarily a bad thing: it means each model is potentially still learning. That said, this process is very subjective, and close analysis is the only way to decide when to stop training. I settled on 250 epochs after much trial and error, analyzing the generated images at each epoch.

Speaking of which, now let's plot a generated image from every 5 epochs.

In [20]:
# Load generated images from each epoch.
gen_images_files = os.listdir(f"{current_wdir}/Generated_Images")
gen_images = {}

for file in gen_images_files:
    # Skip non-image files.
    if '.jpg' not in file:
        continue
    # Filenames are "{epoch:03}_{image_number}.jpg"; keep one image (number 0) per epoch.
    epoch, img_num = file.split('_')
    if img_num != '0.jpg':
        continue
    img = Image.open(f"{current_wdir}/Generated_Images/{file}")
    gen_images[epoch] = np.asarray(img)
In [21]:
# Plot one generated image from every 5th epoch, 5 per row.
n_cols = 5
epoch_ticks = range(5, epochs + 1, 5)
n_rows = len(epoch_ticks) // n_cols
fig, ax = plt.subplots(n_rows, n_cols, figsize = (10, int(epochs/10)), sharex = True, sharey = True)
for i, idx in enumerate(epoch_ticks):
    ax[i // n_cols, i % n_cols].imshow(gen_images[f'{idx:03}'])
    ax[i // n_cols, i % n_cols].set_title(f'{idx}')
plt.savefig('./Generated_Images/5_Epoch_Images.jpg')
plt.show()
  • Echoing the training loss history, there is a clear difference between the images from the early epochs and the end; the images don't gain a sharper focus until after epoch 150.
  • The earlier epochs have a cross-hatch quality that mostly disappears later, though some artifacts remain in the final epochs.
  • There is also a decent amount of variety. While no sharp details come into focus, the main Monet-esque characteristics are coming through.

Now, let's take a closer look at the last 20 epochs to get a better idea of the generative abilities of this model.

In [22]:
# Plot a generated image from the last 20 epochs.
last_gen_img_idx = range(epochs-19, epochs+1, 1)
fig, ax = plt.subplots(4, 5, sharex = True, sharey = True)
for i, idx in enumerate(last_gen_img_idx):
    ax[i // 5, i % 5].imshow(gen_images[f'{str(idx).zfill(3)}'])
    ax[i // 5, i % 5].set_title(f'{idx}')
    ax[i // 5, i % 5].axis('off')
#fig.tight_layout()
plt.savefig('./Generated_Images/Last_20_Images.jpg')
plt.show()

These look great! Looking through the full Monet dataset, you can clearly see which qualities these generated images draw from.

Finally, let's talk about this model's "skeleton in the closet" so to speak...

It seems to suffer from Mode Collapse.

To demonstrate this, we will randomly generate 6 latent vectors and feed them through the trained Generator to produce 6 images from the finished model.

In [23]:
fig, ax = plt.subplots(2, 3, sharex = True, sharey = True)
prediction = generator.predict(np.random.normal(size = (6,latent_dim)))
prediction = ((prediction + 1) * 127.5).astype(np.uint8)
for i in range(6):
    ax[i // 3, i % 3].imshow(prediction[i])
plt.show()
1/1 ━━━━━━━━━━━━━━━━━━━━ 2s 2s/step

It's immediately apparent that all of these images share the same "style" and the same color palette. At each epoch the model would switch "styles" and colors, but only ever produce one type within that epoch. This is indicative of mode collapse, which typically shows itself this way: the Generator produces a very limited set of outputs. Many things can cause mode collapse, and its prevalence is part of what makes training GANs so difficult.
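A crude way to quantify this is to measure the per-pixel spread across a batch of generated images. The batch_diversity helper below is a hypothetical diagnostic, not part of the model; a value near zero suggests the Generator has collapsed to a single mode:

```python
import statistics

def batch_diversity(images):
    # Hypothetical diagnostic: `images` holds one flat list of pixel values per
    # generated image. Average the per-pixel standard deviation over the batch.
    per_pixel_spread = [statistics.pstdev(pixels) for pixels in zip(*images)]
    return sum(per_pixel_spread) / len(per_pixel_spread)

collapsed = [[100, 100, 100]] * 6          # six identical "images"
varied = [[0, 0, 0], [255, 255, 255]] * 3  # alternating black/white
print(batch_diversity(collapsed), batch_diversity(varied))  # 0.0 127.5
```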

Causes of Mode Collapse (in no particular order):

  1. The Discriminator dominates the Generator, stifling training by never letting the Generator learn from a wider range of samples.
  2. A small, unbalanced, or low-diversity training dataset.
  3. Poor hyperparameter tuning; GANs are heavily affected by their hyperparameters.
  4. Choice of loss function. Min-max BCE has been found difficult to optimize in an adversarial model. Several alternatives, such as the Wasserstein loss used in a WGAN, suffer less from unstable gradients and thus from mode collapse.
  5. A latent space that is too small, or a poor method of randomization/noise. The latent vector is the seed from which the Generator creates images, so if the space isn't large or varied enough, the outputs won't vary either.
  6. Filter kernel size (too small can be an issue) and the number of convolution layers.
  7. Normalization between layers. While it is generally agreed that normalization after convolution layers is good practice in DCGANs, the tuning and even the type are debated. Several alternatives to Batch Normalization have been shown to work well.
  8. The Generator overfits to the Discriminator, essentially always fooling it, typically by collapsing to one or two "modes".

More information on this subject and sources for some of this can be found in the Appendix at the end.
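To make the loss-function point (cause 4) concrete, here is a minimal sketch of the Wasserstein losses used in a WGAN, written with plain Python lists for clarity; in practice these would operate on batches of critic logits, and a WGAN also needs a Lipschitz constraint (weight clipping or a gradient penalty) that is omitted here:

```python
def wgan_d_loss(real_scores, fake_scores):
    # The critic (WGAN's discriminator) outputs unbounded scores, not
    # probabilities, and tries to widen the gap between real and fake scores.
    return sum(fake_scores) / len(fake_scores) - sum(real_scores) / len(real_scores)

def wgan_g_loss(fake_scores):
    # The Generator tries to raise the critic's score on its fakes.
    return -sum(fake_scores) / len(fake_scores)

# A critic that scores real images higher than fakes has a negative loss.
print(wgan_d_loss([2.0, 4.0], [0.0, 1.0]))  # -2.5
```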

I built countless alterations of this final model to address these issues, and while I believe I narrowed the problem down quite well, I did not overcome it (more on what was done for the final competition submission at the end of this section). Below is a summary of the general path I took to attempt to solve the issue.

Methods Attempted to Mitigate Mode Collapse:

  • Batch Normalization: Included after most convolution layers.
  • Layer Normalization: According to some sources, this works better than Batch Normalization in GANs; here, however, it made the model very unstable and unable to converge, exploding the gradients. More tuning is required, but I did not pursue it further.
  • Deeper Generator and Discriminator Networks: Added one more layer to the Discriminator and eventually two to the Generator (this improved performance, but not mode collapse).
  • Larger Latent Space (128 to 256 to 512): In case this was the limiting factor, the latent vector size was increased several times.
  • Different Random Noise in the Latent Space: Random values from normal and uniform distributions were tried; normal seemed to produce better results for me.
  • Different Filter Kernel Sizes in the Generator: Went from (3, 3) to (4, 4) to (5, 5). (5, 5) performed the best but did not solve mode collapse.
  • One-Sided Label Smoothing for the Discriminator: Softening the Discriminator's positive-class labels (1) down to 0.9 can help the Generator during training. You can find this in the GAN class, in the train_step function.
  • Remove All Seeds for Latent Vector Initialization: Worried that seeding the latent space was holding the Generator back, I removed the seeds in an attempt at more randomness.
  • Adjust Beta 1 and Beta 2 on the Adam Optimizers for the Generator and Discriminator: Improved performance quite a bit, but did not solve mode collapse.
  • Raise the Learning Rate on the Generator (0.0002), Leaving the Discriminator's at 0.0001: This lets the Generator learn faster relative to the Discriminator.
  • Flipped the Generator's Convolution Filter Counts to Start at 512 and Decrease: The model still works, but this didn't help.
  • Remove Bias from the Upper Convolution Layers: Set use_bias = False. I noticed many CycleGANs do this, but never found a source explaining why.
  • Larger Filters in the Discriminator: In many examples I found, the filter sizes match between both models, so I matched them after increasing the Generator's to (5, 5).
  • Countless hyperparameter adjustments.
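The one-sided label smoothing from the list above can be sketched like this; smooth_real_labels is a hypothetical helper for illustration, not the exact code in train_step:

```python
import random

def smooth_real_labels(batch_size, low = 0.9, high = 1.0):
    # Hypothetical helper: real images get labels drawn from [0.9, 1.0]
    # instead of a hard 1.0, keeping the Discriminator from overconfidence.
    return [random.uniform(low, high) for _ in range(batch_size)]

real_labels = smooth_real_labels(32)
fake_labels = [0.0] * 32  # Fake labels are left untouched (one-sided).
```

Because only the positive labels are softened, the Discriminator's gradients on real images shrink slightly, giving the Generator more room to learn.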

There is quite a lot more information that I haven't touched on this subject, but let's leave it here for now and complete the project.

Now to prepare the submission.

In [24]:
# generated_images = generator.predict(np.random.normal(size = (7000,latent_dim)))
# generated_images = (generated_images + 1) * 127.5
# for i in range(7000):
#     img = tf.keras.utils.array_to_img(generated_images[i])
#     img.save(f"{current_wdir}/images/{epochs}_{i}.jpg")
In [25]:
import shutil
shutil.make_archive("/kaggle/working/images", 'zip', "/kaggle/working/images")
Out[25]:
'/kaggle/working/images.zip'

As mentioned above, even though the final model still suffered from mode collapse, analyzing the images generated during training suggested there was still a way to submit a variety of images.

Each epoch saw the Generator produce images in a new style (mode), and since the final epochs showed minimal differences in quality between one another, images were generated and sampled from the last 20 epochs.

Here is a sample of those images.

In [26]:
last_20_files = os.listdir(f"{current_wdir}/images")
rand_last_20_images = []

for file in np.random.choice(last_20_files, 100, replace = False):
    # Skip non-image files.
    if '.jpg' not in file:
        continue
    # Separate the epoch number and image number from the filename.
    epoch, num = file.rsplit('.', 1)[0].split('_')
    img = Image.open(f"{current_wdir}/images/{file}")
    rand_last_20_images.append(np.asarray(img))

fig, ax = plt.subplots(20, 5, figsize = (10,40), sharex = True, sharey = True)
for i, img in enumerate(rand_last_20_images):
    ax[i // 5, i % 5].imshow(img)
plt.show()

And as a reminder of a few of the Monet paintings for comparison...

In [31]:
for img in monet:
    fig, ax = plt.subplots(2, 5, figsize = (10,4), sharex = True, sharey = True)
    img = ((img.numpy() + 1) * 127.5).astype(np.uint8)
    for i in range(10):
        ax[i // 5, i % 5].imshow(img[i])
    break

Is it going to fool any art buffs and sell for millions of dollars? No. However, aside from the remaining artifacting from the convolutions, which makes it look like Monet suddenly got into quilting, the generated images do look distinctly Monet-esque. In my opinion they capture the color blocking and combinations of his paintings, and even some of his textures and the way he layers paint.

While this seems like a bit of a workaround, all previously submitted models used only images generated after the last epoch. You can see below that the previous version scored ~18 higher, i.e. worse (Version 12; same architecture, the only differences being the images submitted and randomness, of course). We can also see a few earlier versions of this model from before the adjustments listed above.

Below you can find a screenshot of all the results from the Kaggle submissions.

[Screenshot: Kaggle submission results]

Alongside MiFID scores let's look at a sample of a generated image from each of these model versions.

[Sample generated image: Version 5]
  • Version 5 received a MiFID score of 335. This model was shallow, with only a handful of convolution layers in the Generator, and did not converge on anything visibly Monet-like.

[Sample generated image: Version 6]
  • Version 6, with a score of 363, was deeper than Version 5, with 5 convolutional layers.

[Sample generated image: Version 7]
  • Version 7, with a score of 206, finally showed a leap in improvement and included many of the changes listed above: 6 convolutional layers, a larger latent space, smoothed Discriminator labels, etc.

[Sample generated image: Version 12]
  • Version 12 uses the same architecture and hyperparameters as this final notebook, but only submitted images generated after training completed (after the final epoch).

[See Images Above]
  • Version 16, the final model, scored 128.65 and submitted the images shown in this notebook (above). Again, this architecture is the same as Version 12, but with added variety from sampling images across the last 20 epochs for submission.
Back to Table of Contents¶

7. Conclusion ¶


Ultimately, through the iterative process described in this project, the performance of the resultant DCGAN improved to a respectable level. A score of ~128.65 and 53rd place is a good result in this case. With further tweaks, the mode collapse this model suffered from could be overcome, increasing performance further.

Of course, a model that performs a true style transfer like pix2pix or CycleGAN (training a Generator with an autoencoder or U-Net architecture that can receive an image as input and generate from it) is going to score better than a DCGAN in this competition, but I found this exercise in purely noise-driven generation to be a valuable one.

More specifically, the difficulties in training a GAN can come from all directions, including:

  • The dataset should be large and varied, which in this case was achieved by applying random image augmentations/transformations every time the dataloader loads a batch during training.
  • The architectures of the Discriminator and Generator not only need to perform well individually, they must complement each other so that neither is "weaker" than the other. This balance can come from the layers themselves and/or the optimizers, learning rates, and other hyperparameters. Judging from these results, at least, deeper networks also outperform shallow ones.
  • Intermediate layer normalization is essential to GAN convergence; it can reduce the likelihood of mode collapse and of intra-batch correlations affecting image generation.
  • Random noise underpins multiple aspects of GANs and is important to get right and tune: in the latent space, it affects the model's generative abilities and its variety; in the Discriminator labels, it affects the training process and resilience to overfitting.
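The augmentation point above can be sketched with tf.data: mapping a random-transform function into the pipeline re-rolls the randomness every time a batch is drawn. This is a minimal sketch of the idea; the exact transforms used earlier in the notebook may differ.

```python
import tensorflow as tf

def random_augment(image):
    # Random flips plus a jittered crop, so the 300 Monet paintings
    # present differently on every training pass.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    # Upscale slightly, then crop back to 256 x 256.
    image = tf.image.resize(image, [286, 286])
    image = tf.image.random_crop(image, size=[256, 256, 3])
    return image

# Because the map runs lazily, fresh random transforms are applied
# each time the dataset is iterated during training.
dummy = tf.random.uniform([4, 256, 256, 3])
ds = tf.data.Dataset.from_tensor_slices(dummy).map(random_augment).batch(4)
augmented = next(iter(ds))
```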

GANs can be difficult to build and tune properly. The subjective nature of building and training a DCGAN, or any GAN for that matter, that generates Monet-esque images from random noise latent vectors is its own kind of art.

7.1. Possible Areas for Improvement ¶

  • Solve Mode Collapse
    • Self-Modulation Layers instead of Batch Normalization
    • Properly tune Layer Normalization
    • Mini-batch training images.
    • Change model to WGAN.
  • Analyze a variety of images vs MiFID score to gain a better intuition for the distance metric.
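On the WGAN suggestion: a WGAN swaps the binary cross-entropy game for a Wasserstein (earth-mover) objective, in which the critic scores realness on an unbounded scale. This tends to give smoother gradients and resist mode collapse. A minimal sketch of the two losses (a full WGAN would also need weight clipping or a gradient penalty):

```python
import tensorflow as tf

def critic_loss(real_scores, fake_scores):
    # The critic pushes real scores up and fake scores down.
    return tf.reduce_mean(fake_scores) - tf.reduce_mean(real_scores)

def generator_loss(fake_scores):
    # The generator pushes fake scores up.
    return -tf.reduce_mean(fake_scores)

real = tf.constant([1.0, 2.0])
fake = tf.constant([0.0, -1.0])
c_loss = critic_loss(real, fake)   # mean(fake) - mean(real) = -2.0
g_loss = generator_loss(fake)      # -mean(fake) = 0.5
```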

Appendix A - Online References: ¶

Resources that helped along the way in no particular order.

  • Great tutorial for image generation using GANs https://machinelearningmastery.com/how-to-develop-a-generative-adversarial-network-for-a-cifar-10-small-object-photographs-from-scratch/
  • Starting point for DCGANs https://keras.io/examples/generative/dcgan_overriding_train_step/
  • Solution to Mode Collapse without changing model? https://arxiv.org/pdf/1810.01365
  • Identifying unstable GANs and their failure modes https://machinelearningmastery.com/practical-guide-to-gan-failure-modes/

    Exported to HTML via command line using:

  • jupyter nbconvert Adversarial_Painting.ipynb --to html

  • jupyter nbconvert Adversarial_Painting.ipynb --to html --HTMLExporter.theme=dark
Back to Table of Contents¶